- 
            Open source software (OSS) underpins modern software infrastructure, yet many projects struggle with long-term sustainability. We introduce OSSPREY, an AI-powered platform that can predict the sustainability of any GitHub-hosted project. OSSPREY collects longitudinal socio-technical data, such as commits, issues, and contributor interactions, and uses a transformer-based model to generate month-by-month sustainability forecasts. When project downturns are detected, it recommends evidence-based interventions drawn from published software engineering studies. OSSPREY integrates scraping, forecasting, and actionable guidance into an interactive dashboard, enabling maintainers to monitor project health, anticipate decline, and respond with targeted strategies. By connecting real-time project data with research-backed insights, OSSPREY offers a practical tool for sustaining OSS projects at scale. The codebase is linked to the project website at: https://oss-prey.github.io/OSSPREY-Website/ The screencast is available at: https://www.youtube.com/watch?v=N7a0v4hPylU
            Free, publicly-accessible full text available November 20, 2026
- 
            Ethical concerns around AI have increased emphasis on model auditing and reporting requirements. We thoroughly review the current state of governance and evaluation practices to identify specific challenges to responsible AI development in OSS. We then analyze OSS projects to understand if model evaluation is associated with safety assessments, through documentation of limitations, biases, and other risks. Our analysis of 7902 Hugging Face projects found that while risk documentation is strongly associated with evaluation practices, high performers from the platform’s largest competitive leaderboard (N=789) were less accountable. Recognizing these delicate tensions from performance incentives may guide providers in revisiting the objectives of evaluation and legal scholars in formulating platform interventions and policies that balance innovation and responsibility.
            Free, publicly-accessible full text available October 20, 2026
- 
            Large Language Models (LLMs) have become pivotal in reshaping the world by enabling advanced natural language processing tasks such as document analysis, content generation, and conversational assistance. Their ability to process and generate human-like text has unlocked unprecedented opportunities across different domains such as healthcare, education, finance, and more. However, commercial LLM platforms face several limitations, including data privacy concerns, context size restrictions, lack of parameter configurability, and limited evaluation capabilities. These shortcomings hinder their effectiveness, particularly in scenarios involving sensitive information, large-scale document analysis, or the need for customized output. This underscores the need for a tool that combines the power of LLMs with enhanced privacy, flexibility, and usability. To address these challenges, we present EvidenceBot, a local, Retrieval-Augmented Generation (RAG)-based solution designed to overcome the limitations of commercial LLM platforms. EvidenceBot enables secure and efficient processing of large document sets through its privacy-preserving RAG pipeline, which extracts and appends only the most relevant text chunks as context for queries. The tool allows users to experiment with hyperparameter configurations, optimizing model responses for specific tasks, and includes an evaluation module to assess LLM performance against ground truths using semantic and similarity-based metrics. By offering enhanced privacy, customization, and evaluation capabilities, EvidenceBot bridges critical gaps in the LLM ecosystem, providing a versatile resource for individuals and organizations seeking to leverage LLMs effectively.
            Free, publicly-accessible full text available June 23, 2026
- 
            In the rapidly evolving domain of software engineering (SE), Large Language Models (LLMs) are increasingly leveraged to automate developer support. Open source LLMs have grown competitive with proprietary models such as GPT-4 and Claude-3, without the associated financial and accessibility constraints. This study investigates whether state-of-the-art open source LLMs, including Solar-10.7B, CodeLlama-7B, Mistral-7B, Qwen2-7B, StarCoder2-7B, and LLaMA3-8B, can generate responses to technical queries that align with those crafted by human experts. Leveraging retrieval augmented generation (RAG) and targeted fine tuning, we evaluate these models across critical performance dimensions, such as semantic alignment and contextual fluency. Our results show that Solar-10.7B, particularly when paired with RAG and fine tuning, most closely replicates expert level responses, offering a scalable and cost effective alternative to commercial models. This vision paper highlights the potential of open-source LLMs to enable robust and accessible AI-powered developer assistance in software engineering.
            Free, publicly-accessible full text available May 23, 2026
- 
            Recent work on open source sustainability shows that successful trajectories of projects in the Apache Software Foundation Incubator (ASFI) can be predicted early on, using a set of socio-technical measures. Because OSS projects are socio-technical systems centered around code artifacts, we hypothesize that sustainable projects may exhibit different code and process patterns than unsustainable ones, and that those patterns can grow more apparent as projects evolve over time. Here we studied the code and coding processes of over 200 ASFI projects, and found that ASFI graduated projects have different patterns of code quality and complexity than retired ones. The same holds for the coding processes: e.g., feature commits or bug-fixing commits are correlated with project graduation success. We find that minor contributors and major contributors (who contribute <5% and >=95% of commits, respectively) associate with graduation outcomes, implying that also having developers who contribute fewer commits is important for a project’s success. This study provides evidence that OSS projects, especially nascent ones, can benefit from introspection and instrumentation using multidimensional modeling of the whole system, including code, processes, and code quality measures, and how they are interconnected over time.
- 
            Open-source Software (OSS) has become a valuable resource in both industry and academia over the last few decades. Despite the innovative structures they develop to support the projects, OSS projects and their communities have complex needs and face risks such as getting abandoned. To manage the internal social dynamics and community evolution, OSS developer communities have started relying on written governance documents that assign roles and responsibilities to different community actors. To facilitate the study of the impact and effectiveness of formal governance documents on OSS projects and communities, we present a longitudinal dataset of 710 GitHub-hosted OSS projects with GOVERNANCE.MD governance files. This dataset includes all commits made to the repository, all issues and comments created on GitHub, and all revisions made to the governance file. We hope its availability will foster more research interest in studying how OSS communities govern their projects and the impact of governance files on communities.
 An official website of the United States government
                                     Full Text Available